53 research outputs found

    Feature selection for microarray gene expression data using simulated annealing guided by the multivariate joint entropy

    This work presents a new way to calculate the multivariate joint entropy. This measure is the basis for a fast information-theoretic evaluation of gene relevance in a microarray gene expression data context. Its low complexity comes from reusing previous computations to calculate the relevance of the current feature. The mu-TAFS algorithm (named as such to differentiate it from previous TAFS algorithms) implements a simulated annealing technique specially designed for feature subset selection. The algorithm is applied to the maximization of gene subset relevance in several public-domain microarray data sets. The experimental results show notably high classification performance and small subsets formed by biologically meaningful genes. Postprint (published version)
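
    As a rough illustration of the quantity this abstract refers to, the sketch below computes the plain empirical joint entropy of a set of discrete feature columns. It is not the paper's fast incremental method; the function name `joint_entropy` and the toy `rows` data are assumptions made for this example.

```python
from collections import Counter
from math import log2

def joint_entropy(rows, features):
    """Empirical joint entropy (in bits) of the chosen feature columns.

    rows: sequence of samples, each an indexable sequence of discrete values.
    features: indices of the columns whose joint distribution is measured.
    """
    # Count how often each combination of values occurs across the samples.
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    n = len(rows)
    # H = -sum p * log2(p) over the observed value combinations.
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Hypothetical toy data: two binary features, all four combinations once.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
h_single = joint_entropy(rows, [0])
h_pair = joint_entropy(rows, [0, 1])
```

    This naive version recounts every sample for each feature subset; the abstract's point is that such recomputation can be avoided by reusing earlier results.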

    Feature selection in proton magnetic resonance spectroscopy data of brain tumors

    In cancer diagnosis, classification of the different tumor types is of great importance. An accurate prediction of tumor type enables better treatment and may minimize the negative impact of incorrectly targeted toxic or aggressive treatments. Moreover, the correct prediction of cancer types using non-invasive information (e.g. 1H-MRS data) could spare patients the collateral problems derived from exploration techniques that require surgery. A feature selection algorithm specially designed for use with proton magnetic resonance spectroscopy (1H-MRS) data of brain tumors is presented. It takes advantage of a highly distinctive aspect of this data: some metabolite levels are markedly different between types of tumors. Experimental readings on an international dataset show highly competitive models in terms of accuracy, complexity and medical interpretability. Postprint (author’s final draft)

    TFS: a thermodynamical search algorithm for feature subset selection

    This work tackles the problem of selecting a subset of features in an inductive learning setting by introducing a novel Thermodynamic Feature Selection algorithm (TFS). Given a suitable objective function, the algorithm makes use of a specially designed form of simulated annealing to find a subset of attributes that maximizes the objective function. The new algorithm is evaluated against one of the most widespread and reliable algorithms, the Sequential Forward Floating Search (SFFS). Our experimental results in classification tasks show that TFS achieves significant improvements over SFFS in the objective function with a notable reduction in subset size. Peer Reviewed. Postprint (published version)
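
    The general idea of searching feature subsets with simulated annealing can be sketched as follows. This is a generic textbook-style annealer, not the TFS algorithm itself; the function `anneal_subset`, its parameters, the cooling schedule, and the toy objective are all assumptions made for illustration.

```python
import math
import random

def anneal_subset(n_features, objective, t_start=1.0, t_end=0.01,
                  cooling=0.95, steps_per_temp=20, seed=0):
    """Maximize objective(subset) over subsets of range(n_features)."""
    rng = random.Random(seed)
    current = frozenset()
    current_score = objective(current)
    best, best_score = current, current_score
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            # Neighbor move: toggle one randomly chosen feature in or out.
            f = rng.randrange(n_features)
            candidate = current ^ {f}
            score = objective(candidate)
            delta = score - current_score
            # Always accept improvements; accept a worse subset with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = candidate, score
                if score > best_score:
                    best, best_score = candidate, score
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score

# Hypothetical objective: features 1 and 3 are useful, the rest cost 0.1 each.
useful = {1, 3}
def toy_objective(subset):
    return len(subset & useful) - 0.1 * len(subset - useful)

best, best_score = anneal_subset(6, toy_objective)
```

    In practice the objective would be a relevance or classification-quality measure evaluated on data, which is where most of the running time goes.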

    Machine learning methods for classifying normal vs. tumorous tissue with spectral data

    Machine learning is a powerful paradigm within which to analyze 1H-MRS spectral data for the automated classification of tumor pathologies, aimed at facilitating clinical diagnosis. The high dimensionality of the data sets involved makes the discovery of computational models a challenging task. In this study we apply a feature selection algorithm in order to reduce the complexity of the problem. The experimental results show remarkable classification performance of the final induced models, both in terms of prediction accuracy and in the number of spectral frequencies involved. A dimensionality reduction technique that preserves class discrimination capabilities is used to visualize the final selected frequencies, thus enhancing their interpretability. Peer Reviewed. Postprint (author’s final draft)

    Differentiation of glioblastomas and metastases using 1H-MRS spectral data

    Hydrogen-1 magnetic resonance spectroscopy (1H-MRS) allows noninvasive in vivo quantification of metabolite concentrations in brain tissue. In this work two of the most aggressive brain tumors are studied with the purpose of differentiating them. The challenge in this task is that their radiological appearance is often similar, even though the treatment of patients suffering from these conditions is quite different. Efforts to differentiate between these two profiles are receiving increasing attention, mainly because of the consequences of an incorrect diagnosis. Due to the high dimensionality of the data, initiatives that reduce the complexity of the description become important. In this work we present a feature selection algorithm that generates relevant subsets of spectral frequencies. Experimental results deliver models that are simple in terms of the number of frequencies and show good generalization capabilities. Postprint (author’s final draft)

    Graphical Framework for Categorizing Data Capabilities and Properties of Objects in the Internet of Things

    Things are the core of the Internet of Things (IoT) and must be properly characterized according to the different functions they accomplish. Identifying their capabilities and combining them as sets provides a view of the single or joint properties of existing things and guides the proper design and construction of new things while maximizing their potential benefits within an IoT system or application. Building on five essential but independent capabilities of things (Identification, Localization, Sensing, Actuation, and Processing), four categories or groups of things are defined. These groups offer a particular view of the diversity of objects found in the IoT: trackable, data, interactive, or smart objects. In this paper, a description of these capabilities is presented, stating how each of the groups of objects includes them. Then, given that data are the most important assets for both organizations and individuals, the data objects group is described further, and a graphical categorization framework is proposed that describes and measures the degree to which each of these capabilities is present and how it contributes to the performance and data properties of any data object.

    Gene discovery for facioscapulohumeral muscular dystrophy by machine learning techniques

    Facioscapulohumeral muscular dystrophy (FSHD) is a neuromuscular disorder that shows a preference for the facial, shoulder and upper arm muscles. FSHD affects about one in 20-400,000 people, and no effective therapeutic strategies are known to halt disease progression or reverse muscle weakness or atrophy. Many genes may be incorrectly regulated in affected muscle tissue, but the mechanisms responsible for the progressive muscle weakness remain largely unknown. Although machine learning (ML) has made significant inroads in biomedical disciplines such as cancer research, no reports have yet addressed FSHD analysis using ML techniques. This study explores a specific FSHD data set from an ML perspective. We report results showing a very promising small group of genes that clearly separates FSHD samples from healthy samples. In addition to numerical prediction figures, we show data visualizations and biological evidence illustrating the potential usefulness of these results. Peer Reviewed. Postprint (published version)

    Evidencia Empírica de la Minería de Procesos en la Implantación de CMMI-DEV

    Abstract: Process mining aims to discover, monitor, and improve processes by analyzing the various event logs generated by an organization's processes. The objective of this work is to present empirical evidence for the strategic inclusion of the process mining discipline in software process improvement projects implemented with CMMI. During the systematic mapping of the literature review, four categories were established to classify the findings (theoretical foundations, proposals, tools and information systems, and algorithms) in order to present the studies that meet the objective. The work concludes that the interdisciplinarity of process mining with a process reference model such as CMMI-DEV supports the implementation and evaluation of process areas, by applying process mining techniques and algorithms that facilitate the exploration and exploitation of event logs, stored in a repository, that relate to the execution of activities. Keywords: Process Mining, Software Process Improvement, Event Log

    Revisión y Control del Plan de Vigilancia Ambiental de las obras de dragado del Puerto de Maó

    Hydrographic, geomorphological, sedimentological, and biological information is integrated to characterize the marine ecosystems at the discharge point and adjacent area prior to the start of the dredging works in the Port of Maó. Abstract: This document presents the scientific work carried out by the Instituto Español de Oceanografía within the Environmental Monitoring Plan for the dredging of the Port of Maó (Minorca, Balearic Islands), to characterize the marine ecosystems at the point of discharge of dredged material and the adjacent area before the dredging began. It includes the results and conclusions of the studies carried out by several research groups, mainly in January-March 2014, on the sea bottom, hydrodynamics, Posidonia oceanica meadows and the bivalve mollusc Pinna nobilis, the macro-benthos of the circalittoral soft bottoms, and the contaminants in water, sediments and biota, as well as in commercial species for human consumption. This report falls under the contract between the Autoridad Portuaria de Baleares and the Instituto Español de Oceanografía, signed on 5 February 2014, for technical assistance in reviewing and controlling the Environmental Monitoring Plan for the dredging of the Port of Maó. Autoridad Portuaria de Baleares